Search results for "File format"

showing 10 items of 22 documents

FASTdoop: A versatile and efficient library for the input of FASTA and FASTQ files for MapReduce Hadoop bioinformatics applications

2017

MapReduce Hadoop bioinformatics applications require the availability of special-purpose routines to manage the input of sequence files. Unfortunately, the Hadoop framework does not provide any built-in support for the most popular sequence file formats like FASTA or BAM. Moreover, the development of these routines is not easy, both because of the diversity of these formats and the need to efficiently manage sequence datasets that may count up to billions of characters. We present FASTdoop, a generic Hadoop library for the management of FASTA and FASTQ files. We show that, with respect to analogous input management routines that have appeared in the literature, it offers…

0301 basic medicine; FASTQ format; Statistics and Probability; Computer science; Sequence analysis; media_common.quotation_subject; Information Storage and Retrieval; Bioinformatics; computer.software_genre; Genome; Biochemistry; Domain (software engineering); 03 medical and health sciences; Computational Theory and Mathematic; Humans; Genomic library; Quality (business); DNA sequencing; FASTQ; NGS; FASTQ; DNA sequencing; Molecular Biology; media_common; Gene Library; Sequence; Database; Settore INF/01 - Informatica; Genome Human; Computer Science Applications1707 Computer Vision and Pattern Recognition; Genomics; Sequence Analysis DNA; FASTQ; File format; Computer Science Applications; Statistics and Probability; Biochemistry; Molecular Biology; Computer Science Applications1707 Computer Vision and Pattern Recognition; Computational Theory and Mathematics; Computational Mathematics; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; NGS; Database Management Systems; computer
researchProduct
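
As orientation for the FASTQ format mentioned in the abstract above: a FASTQ file conventionally stores each read as four lines (a header starting with `@`, the sequence, a separator starting with `+`, and a quality string of the same length as the sequence). The following is a minimal Python sketch of reading such records. It is not FASTdoop's Hadoop API; `FastqRecord` and `parse_fastq` are hypothetical names chosen for illustration, and the sketch assumes the common four-line layout without wrapped sequences.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One read: identifier, bases, and per-base quality characters."""
    name: str
    sequence: str
    quality: str


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream (hypothetical helper)."""
    while True:
        header = handle.readline()
        if not header:  # end of input
            return
        header = header.rstrip("\r\n")
        if not header:  # tolerate blank lines between records
            continue
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


# Small in-memory example with two made-up reads.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!!!\n"
records = list(parse_fastq(io.StringIO(example)))
```

A distributed reader has the additional problem that an input split can begin in the middle of a record, which this single-stream sketch does not address.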

The Effects of Static and Dynamic Visual Representations as Aids for Primary School Children in Tasks of Auditory Discrimination of Sound Patterns. A…

2018

It has been proposed that non-conventional presentations of visual information could be very useful as a scaffolding strategy in the learning of Western music notation. As a result, this study has attempted to determine if there is any effect of static and dynamic presentation modes of visual information in the recognition of sound patterns. An intervention-based quasi-experimental design was adopted with two groups of fifth-grade students in a Spanish city. Students did tasks involving discrimination, auditory recognition and symbolic association of the sound patterns with non-musical representations, either static images (S group), or dynamic images (D group). The results showed neither s…

0301 basic medicine; Musical notation; media_common.quotation_subject; education; Educació primària; Notation; Education; static-dynamic presentation discrimination of melodic patterns; 03 medical and health sciences; Presentation; Covariate; Digital learning; Association (psychology); media_common; lcsh:T58.5-58.64; lcsh:Information technology; General Engineering; computer.file_format; Music education; 030104 developmental biology; music education; Image file formats; lcsh:L; Psychology; bimodality; computer; lcsh:Education; Cognitive psychology; International Journal of Emerging Technologies in Learning (iJET)
researchProduct

The Human Proteome Organization–Proteomics Standards Initiative Quality Control Working Group: Making quality control more accessible for biological …

2017

To have confidence in results acquired during biological mass spectrometry experiments, a systematic approach to quality control is of vital importance. Nonetheless, until now, only scattered initiatives have been undertaken to this end, and these individual efforts have often not been complementary. To address this issue, the Human Proteome Organization–Proteomics Standards Initiative has established a new working group on quality control at its meeting in the spring of 2016. The goal of this working group is to provide a unifying framework for quality control data. The initial focus will be on providing a community-driven standardized file format for quality control. For this purpose, the…

0301 basic medicine; Proteomics; Quality Control; Proteomics Standards Initiative; Proteome; Chemistry; media_common.quotation_subject; Control (management); File format; Data science; Mass Spectrometry; Analytical Chemistry; Variety (cybernetics); 03 medical and health sciences; 030104 developmental biology; Control data; Human proteome project; Humans; Use case; Quality (business); Databases Protein; media_common
researchProduct

CoverageAnalyzer (CAn): A Tool for Inspection of Modification Signatures in RNA Sequencing Profiles

2016

Combination of reverse transcription (RT) and deep sequencing has emerged as a powerful instrument for the detection of RNA modifications, a field that has seen a recent surge in activity because of its importance in gene regulation. Recent studies yielded high-resolution RT signatures of modified ribonucleotides relying on both sequence-dependent mismatch patterns and reverse transcription arrests. Common alignment viewers lack specialized functionality, such as filtering, tailored visualization, image export and differential analysis. Consequently, the community will profit from a platform seamlessly connecting detailed visual inspection of RT signatures and automated screening for modifi…

0301 basic medicine; RNA modifications; reverse transcription; reverse transcription (RT) signature; RNA sequencing (RNA-Seq); Next-Generation Sequencing (NGS); candidate screening; alignment viewer; Next-Generation Sequencing (NGS); lcsh:QR1-502; [ SDV.BBM.BM ] Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; Biology; computer.software_genre; 01 natural sciences; Biochemistry; Field (computer science); Differential analysis; Deep sequencing; lcsh:Microbiology; Article; World Wide Web; 03 medical and health sciences; User-Computer Interface; RNA modifications; RNA sequencing (RNA-Seq); [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; candidate screening; Molecular Biology; ComputingMilieux_MISCELLANEOUS; 010405 organic chemistry; Sequence Analysis RNA; Gene Expression Profiling; RNA; Computational Biology; High-Throughput Nucleotide Sequencing; [SDV.BBM.BM]Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; reverse transcription (RT) signature; reverse transcription; File format; alignment viewer; 0104 chemical sciences; Visualization; Visual inspection; 030104 developmental biology; [ SDV.BBM.GTP ] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Data mining; computer; Software; Biomolecules
researchProduct

Main Steps in Image Processing and Quantification: The Analysis Workflow

2019

In the last decades, the variety of programs, algorithms, and strategies that researchers have at their disposal to process and analyze image files has grown extensively. However, these are only pointless tools if not applied with the careful planning required to achieve a successful image analysis. In order to do so, the analyst must establish a meaningful and effective sequence of orderly operations that is able to (1) overcome all the problems derived from the image manipulation and (2) successfully resolve the question that was originally posed. In this chapter, the authors suggest a set of strategies and present a reflection on the main milestones that compose the image processing workf…

0303 health sciences; Reflection (computer programming); Process (engineering); Computer science; business.industry; Image processing; computer.file_format; Variety (cybernetics); Set (abstract data type); 03 medical and health sciences; 0302 clinical medicine; Workflow; Image file formats; Software engineering; business; computer; 030217 neurology & neurosurgery; 030304 developmental biology
researchProduct

A comparison of HDFS compact data formats: Avro versus Parquet

2017

In this paper, the file formats Avro and Parquet are compared with text formats to evaluate the performance of data queries. Different data query patterns have been evaluated. Cloudera’s open-source Apache Hadoop distribution CDH 5.4 has been chosen for the experiments presented in this article. The results show that the compact data formats (Avro and Parquet) take up less storage space than plain-text data formats because of their binary encoding and compression. Furthermore, data queries on the column-based data format Parquet are faster than on text data formats and Avro. Article in English. Lithuanian title (translated): Comparison of HDFS compact data formats: Avro versus Parquet…

Big Data; Computer science; Big data; Energy Engineering and Power Technology; 02 engineering and technology; Management Science and Operations Research; computer.software_genre; Column (database); 020204 information systems; Data query; 0202 electrical engineering electronic engineering information engineering; HDFS; Database; business.industry; Plain text; Mechanical Engineering; computer.file_format; Avro; File format; Hive; Parquet; Data format; Hadoop; Binary data; 020201 artificial intelligence & image processing; business; computer; Mokslas – Lietuvos ateitis / Science – Future of Lithuania
researchProduct
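
The abstract above contrasts a row-oriented format (Avro) with a column-oriented one (Parquet). The sketch below illustrates that general distinction with plain Python containers only. It does not use the Avro or Parquet libraries and does not reproduce the paper's benchmark; the sample records and the helper names `to_columns`, `total_from_rows`, and `total_from_columns` are assumptions made for illustration.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Vilnius", "amount": 10.0},
    {"id": 2, "city": "Kaunas", "amount": 4.5},
    {"id": 3, "city": "Vilnius", "amount": 7.5},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of per-field value lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented layout: all values of one field are stored contiguously.
columns = to_columns(rows)


def total_from_rows(records, field):
    """Aggregate one field; every whole record has to be visited."""
    return sum(record[field] for record in records)


def total_from_columns(cols, field):
    """Aggregate one field; only that field's values are touched."""
    return sum(cols[field])


row_total = total_from_rows(rows, "amount")
column_total = total_from_columns(columns, "amount")
```

Both layouts return the same total. The difference is in how much data a query over a single field has to read, which is one commonly cited reason for column-based formats answering such queries faster.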

A sensor-data-based denoising framework for hyperspectral images

2015

Many denoising approaches extend image processing to a hyperspectral cube structure, but take into account neither a sensor model nor the format of the recording. We propose a denoising framework for hyperspectral images that uses sensor data to convert an acquisition to a representation facilitating the noise estimation, namely the photon-corrected image. This photon-corrected image format accounts for the most common noise contributions and is spatially proportional to spectral radiance values. The subsequent denoising is based on an extended variational denoising model, which is suited for Poisson-distributed noise. A spatially and spectrally adaptive total variation regularisation term…

Blind deconvolution; [ INFO.INFO-TS ] Computer Science [cs]/Signal and Image Processing; Hyperspectral imaging; Anisotropic diffusion; Computer science; Noise reduction; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image processing; 02 engineering and technology; 01 natural sciences; 010309 optics; Optics; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; 0103 physical sciences; denoising; 0202 electrical engineering electronic engineering information engineering; business.industry; Hyperspectral imaging; computer.file_format; Non-local means; Atomic and Molecular Physics and Optics; Light intensity; Full spectral imaging; Computer Science::Computer Vision and Pattern Recognition; 020201 artificial intelligence & image processing; Image file formats; Noise (video); business; computer
researchProduct

Software for simulating dichromatic perception of video streams

2013

We have designed configurable stand-alone Matlab-based software to simulate dichromatic perception of video streams. The algorithm used is an extension for video streams of the “corresponding pair algorithm” by Capilla and coworkers for simulation of dichromatic perception of images. The software allows the user to upload a video sequence and to process it using different dichromatic color vision models and viewing conditions. The output video may be generated in different spatial and temporal resolutions and file formats. The functions for the Matlab environment and a stand-alone application may be downloaded from the Repository of the University of Alicante. © 2013 Wiley Periodicals, Inc. C…

Color vision; Computer science; General Chemical Engineering; media_common.quotation_subject; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Color vision model; Human Factors and Ergonomics; Upload; Software; Computer graphics (images); Perception; Perception simulation; Computer vision; MATLAB; computer.programming_language; media_common; Óptica; business.industry; Process (computing); Dichromat; General Chemistry; Video processing; File format; Video processing; Artificial intelligence; business; computer; Software
researchProduct

tbg - a new file format for genomic data

2021

Motivation: The question of determining whether a Single-Nucleotide Polymorphism (SNP) or a variant in general leads to a change in the amino acid sequence of a protein coding gene is often a laborious and time-consuming challenge. Here, we introduce the tbg file format for storing genomic data and tbg-tools, a user-friendly toolbox for the faster analysis of SNPs. The file format stores information for each nucleotide in each gene, making it possible to predict which change in the amino acid sequence will be caused by a variant in the nucleotide sequence. Our new tool therefore has the potential to make biological sense of the unprecedented amount of genome-wide genetic variation that research…

Computer science; Genetic variation; Nucleic acid sequence; Single-nucleotide polymorphism; Computational biology; Line (text file); Python (programming language); File format; Peptide sequence; computer; Toolbox; computer.programming_language
researchProduct
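
The question posed in the abstract above, whether a single-nucleotide variant changes the encoded amino acid, can be illustrated with the standard genetic code. The Python sketch below is not the tbg format and does not use tbg-tools; `amino_acid_change` is a hypothetical helper, positions are assumed to be 0-based offsets into an in-frame, forward-strand coding sequence, and the short example sequence is made up.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with
# bases T, C, A, G at each position; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_change(cds: str, position: int, alt_base: str):
    """Return (reference, alternate) amino acids for a substitution.

    Hypothetical helper: `cds` is an in-frame coding sequence and
    `position` is a 0-based index into it.
    """
    start = position - position % 3  # first base of the affected codon
    ref_codon = cds[start:start + 3].upper()
    if len(ref_codon) != 3:
        raise ValueError("position falls in an incomplete trailing codon")
    offset = position - start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


cds = "ATGGAGTAA"  # codons ATG GAG TAA: Met, Glu, stop
missense = amino_acid_change(cds, 4, "T")    # GAG becomes GTG
synonymous = amino_acid_change(cds, 5, "A")  # GAG becomes GAA
```

In this example the first substitution replaces glutamate with valine, while the second leaves the amino acid unchanged. A format that stores this kind of information per nucleotide in advance, as the abstract describes, would avoid recomputing it for every query.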

Three-domain image representation for personal photo album management

2010

In this paper we present a novel approach for personal photo album management. Pictures are analyzed and described in three representation spaces, namely, faces, background and time of capture. Faces are automatically detected and rectified using a probabilistic feature extraction technique. Face representation is then produced by computing PCA (Principal Component Analysis). Backgrounds are represented with low-level visual features based on RGB histogram and Gabor filter bank. Temporal data is obtained through the extraction of EXIF (Exchangeable image file format) data. Each image in the collection is then automatically organized using a mean-shift clustering technique. While many system…

Computer science; business.industry; Feature extraction; CBIR - Content Based Image Retrieval automatic image annotation personal photo album management; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Image processing; computer.file_format; Gabor filter; Automatic image annotation; Histogram; Face (geometry); RGB color model; Computer vision; Artificial intelligence; Image file formats; Image sensor; Cluster analysis; business; computer
researchProduct